33 research outputs found

    Proof by analogy in mural

    One of the most important advantages of using a formal method of developing software is that one can prove that development steps are correct with respect to their specification. Conducting proofs by hand, however, can be so time consuming that designers have to judge whether a proof of a particular obligation is worth conducting. Even if hand proofs are worth conducting, how do we know that they are correct? One approach to overcoming this problem is to use an automatic theorem proving system to develop and check our proofs. However, present-day theorem provers can only check proofs that are conducted in much more detail than hand proofs, and carrying out more detailed proofs is of course more time consuming. This paper describes the use of proof by analogy in an attempt to reduce the time spent on proofs. We develop and implement a proof follower based on analogy and present two examples to illustrate its characteristics. One example illustrates the successful use of the proof follower. The other illustrates that the follower's failure can provide a hint that enables the user to complete a proof.

    A cost-sensitive decision tree learning algorithm based on a multi-armed bandit framework

    This paper develops a new algorithm for inducing cost-sensitive decision trees that is inspired by the multi-armed bandit problem, in which a player in a casino has to decide which slot machine (bandit) from a selection of slot machines is likely to pay out the most. Game Theory proposes a solution to this multi-armed bandit problem by using a process of exploration and exploitation in which reward is maximized. This paper utilizes these concepts to develop a new algorithm by viewing the rewards as a reduction in costs, and utilizing the exploration and exploitation techniques so that a compromise between decisions based on accuracy and decisions based on costs can be found. The algorithm employs the notion of lever pulls in the multi-armed bandit game to select the attributes during decision tree induction, using a look-ahead methodology to explore potential attributes and exploit the attributes which maximize the reward. The new algorithm is evaluated on fifteen datasets and compared to six well-known algorithms: J48, EG2, MetaCost, AdaCostM1, ICET and ACT. The results obtained show that the new multi-armed bandit based algorithm can produce more cost-effective trees without compromising accuracy. The paper also includes a critical appraisal of the limitations of the new algorithm and proposes avenues for further research.
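    The exploration and exploitation trade-off mentioned in this abstract can be illustrated with a generic bandit rule. The sketch below uses UCB1, a standard textbook strategy chosen here as an assumption; it is not the paper's attribute-selection procedure, and the names `ucb1_select`, `run_bandit` and `reward_fn` are hypothetical. A reward could be read as a reduction in cost, as the abstract suggests.

```python
import math


def ucb1_select(counts, totals):
    """Return the index of the arm to pull next under the UCB1 rule.

    counts[i] is how often arm i has been pulled, totals[i] its summed reward.
    """
    # Exploration: every arm is pulled once before estimates are compared.
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    total_pulls = sum(counts)

    def score(arm):
        mean_reward = totals[arm] / counts[arm]  # exploitation term
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])  # exploration term
        return mean_reward + bonus

    return max(range(len(counts)), key=score)


def run_bandit(reward_fn, n_arms, n_rounds):
    """Play n_rounds pulls and return how often each arm was chosen."""
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for _ in range(n_rounds):
        arm = ucb1_select(counts, totals)
        counts[arm] += 1
        totals[arm] += reward_fn(arm)
    return counts


# Illustrative, made-up rewards: arm 2 pays out the most.
pull_counts = run_bandit(lambda arm: [0.2, 0.5, 0.9][arm], n_arms=3, n_rounds=200)
```

    With these made-up rewards, most pulls end up on the best arm while the weaker arms are still sampled occasionally, which is the compromise between exploring and exploiting that the abstract describes.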

    CSNL: A cost-sensitive non-linear decision tree algorithm

    This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes and utilizes discriminant analysis to construct nonlinear decision trees that take account of costs of misclassification. The performance of the algorithm is evaluated by applying it to seventeen datasets and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms in more than twelve of the datasets and is considerably faster. The use of bagging with CSNL further enhances its performance, showing the significant benefits of using nonlinear decision nodes.
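    As a small illustration of what "taking account of costs of misclassification" can mean, the sketch below picks the label with the lowest expected cost given class probabilities and a cost matrix. This is a generic decision rule, not the CSNL algorithm itself; the function name and the numbers are hypothetical.

```python
def min_expected_cost_label(probs, cost):
    """Pick the prediction with the lowest expected misclassification cost.

    probs[k]   : estimated probability that the true class is k
    cost[k][j] : cost of predicting class j when the true class is k
    Returns (best_label, expected_costs).
    """
    n_classes = len(probs)
    expected = [
        sum(probs[k] * cost[k][j] for k in range(n_classes))
        for j in range(n_classes)
    ]
    best = min(range(n_classes), key=expected.__getitem__)
    return best, expected


# Made-up example: missing class 1 is ten times as costly as a false alarm.
label, costs = min_expected_cost_label([0.8, 0.2], [[0, 1], [10, 0]])
```

    Here class 0 is the more probable class, yet the rule predicts class 1 because the expected cost of wrongly predicting class 0 is higher.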

    A New English/Arabic Parallel Corpus for Phishing Emails

    Phishing involves malicious activity whereby phishers, in the disguise of legitimate entities, obtain illegitimate access to the victims’ personal and private information, usually through emails. Currently, phishing attacks and threats are being handled effectively through the use of the latest phishing email detection solutions. Most current phishing detection systems assume phishing attacks to be in English, though attacks in other languages are growing. In particular, Arabic is a widely used language and therefore represents a vulnerable target. However, there is a significant shortage of corpora that can be used to develop Arabic phishing detection systems. This paper presents a new English-Arabic parallel phishing email corpus that has been developed from the text of the anti-phishing shared task (IWSPA-AP 2018). The email content was translated by 10 volunteers who had a university background and were English and Arabic language experts. To evaluate the effectiveness of the new corpus, we develop phishing email detection models using Term Frequency–Inverse Document Frequency (TF-IDF) and a Multilayer Perceptron on 1258 emails in Arabic and English that have equal ratios of legitimate and phishing emails. The experimental findings show that the accuracy reaches 96.82% for the Arabic dataset and 94.63% for the emails in English, providing some assurance of the potential value of the parallel corpus developed.
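    For readers unfamiliar with TF-IDF, the following minimal sketch computes one common unsmoothed variant (relative term frequency times log inverse document frequency) on whitespace-tokenized text. The exact weighting, tokenization and classifier settings used in the paper are not given in the abstract, so everything here, including the function name `tfidf` and the sample sentences, is an illustrative assumption; the Multilayer Perceptron step is not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    This is one common variant; libraries often add smoothing or normalization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return vectors


# Made-up sample texts, not taken from the corpus.
vecs = tfidf(["verify your account now", "meeting notes for your team"])
```

    A term such as "your" that occurs in every document receives weight zero, while a term such as "verify" that occurs in only one document receives a positive weight.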

    A social norms approach to changing school children’s perceptions of tobacco usage

    Purpose: Over 200,000 young people in the UK embark on a smoking career annually, thus continued effort is required to understand the types of interventions that are most effective in changing perceptions about smoking amongst teenagers. Several authors have proposed the use of Social Norms programmes, where correcting misconceptions of what is considered normal behaviour leads to improved behaviours. There are a limited number of studies showing the effectiveness of such programmes for changing teenagers’ perception of smoking habits, and hence this paper reports on the results from one of the largest Social Norms programmes that used a variety of interventions aimed at improving teenagers’ perceptions of smoking. Design/methodology/approach: A range of interventions was adopted for 57 programmes in Year 9 students, ranging from more passive interventions such as posters and banners to more active interventions such as student apps and enterprise days. Each programme consisted of a baseline survey followed by interventions and a repeat survey to calculate changes in perception. A clustering algorithm was also used to reveal the impact of combinations of interventions. Findings: The study reveals three main findings: (i) the use of social norms is an effective means of changing perceptions; (ii) the level of interventions and change in perceptions are positively correlated; and (iii) the most effective combinations of interventions include the use of interactive feedback assemblies, enterprise days, parent and student apps and newsletters to parents. Originality/value: The paper presents results from one of the largest social norm programmes aimed at improving young people’s perceptions and is the first to use clustering methods to reveal the impact of combinations of interventions.

    Inducing safer oblique trees without costs

    Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification. Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety. This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best-known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1) and an algorithm that utilizes robust linear programming.
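    To make the idea of an oblique split that errs on the side of safety more concrete, the sketch below computes a two-attribute Fisher linear discriminant direction and then moves the decision threshold so that no unsafe training example falls on the safe side. The abstract does not specify the paper's "appropriate modification", so the threshold shift used here is purely an assumption for illustration, as are the function names and the toy data.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def project(w, p):
    """Projection of point p onto direction w (the oblique test value)."""
    return w[0] * p[0] + w[1] * p[1]


def fisher_direction(safe, unsafe):
    """Fisher discriminant direction w = Sw^-1 (mean_unsafe - mean_safe)."""
    m_s, m_u = mean(safe), mean(unsafe)
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class scatter matrix
    for pts, m in ((safe, m_s), (unsafe, m_u)):
        for p in pts:
            d = (p[0] - m[0], p[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (m_u[0] - m_s[0], m_u[1] - m_s[1])
    # Multiply diff by the closed-form inverse of the 2x2 scatter matrix.
    return [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
            (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]


def thresholds(w, safe, unsafe):
    """Return (midpoint threshold, safety-biased threshold)."""
    midpoint = 0.5 * (project(w, mean(safe)) + project(w, mean(unsafe)))
    # Assumed modification: never place an unsafe training example on the safe side.
    lowest_unsafe = min(project(w, p) for p in unsafe)
    return midpoint, min(midpoint, lowest_unsafe)


def classify(w, threshold, p):
    return "unsafe" if project(w, p) >= threshold else "safe"


# Toy data: one unsafe example, (2, 2), overlaps the safe cluster.
safe_pts = [(1, 1), (2, 1), (1, 2), (2, 2)]
unsafe_pts = [(4, 4), (5, 4), (4, 5), (5, 5), (2, 2)]
w = fisher_direction(safe_pts, unsafe_pts)
mid, safe_thr = thresholds(w, safe_pts, unsafe_pts)
```

    On this toy data the point (2.5, 2.5) is labelled safe by the midpoint threshold but unsafe by the shifted threshold, at the price of flagging the overlapping safe example (2, 2) as unsafe.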